History
Linux is an open-source OS
It's usually packaged into distributions that all use the same kernel but add other open-source system software and libraries
The kernel was developed by Linus Torvalds
- Inspired by Minix (itself based on Unix)
- Packaged with GNU versions of Unix System Software
Linux System Architecture
- User Mode
  - User Software
  - System Components
    - Daemon Processes
    - Window Manager
    - Graphics
  - API Libraries
    - C Standard Library (~2000 subroutines)
- Kernel Mode
  - Kernel System Call Interface (SCI) (~380 system calls)
  - Kernel (Processes, Memory, Files, Devices, Networking) & Kernel Modules (Device Drivers, etc.)
- Hardware
System Calls
User programs need to access I/O devices to interact with the keyboard, mouse, disk, and network; however, the protection rings prevent direct access
OSs provide a collection of system calls
- Often implemented as an interrupt
- Along with other useful functions, these form the application programmer interface (API) of the OS
Library code is included or imported at the top of source code
- Provides a wrapper that makes system calls look like ordinary subroutine calls
- Libraries (user mode) use system calls (kernel mode) to carry out privileged tasks
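
As a rough illustration of the wrapper idea, the sketch below makes the same write request twice: once through the C library's `write()` wrapper and once by system call number through glibc's generic `syscall()` function. The function names `write_with_wrapper` and `write_with_raw_syscall` are made up for this example, and it assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE       /* exposes syscall() in glibc headers */
#include <string.h>
#include <sys/syscall.h>      /* SYS_write: the system call number */
#include <unistd.h>           /* write(), syscall() */

/* Library wrapper: looks like an ordinary subroutine call. */
long write_with_wrapper(int fd, const char *text) {
    return (long)write(fd, text, strlen(text));
}

/* The same request made by system call number. syscall() is itself a
   small library function that loads the registers and enters the kernel. */
long write_with_raw_syscall(int fd, const char *text) {
    return syscall(SYS_write, fd, text, strlen(text));
}
```

Both functions return the number of bytes written (or -1 on error); calling either with file descriptor 1 prints to standard output.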
Kernel Modules
The kernel is privileged (in every OS); it has no restrictions for any action
Device drivers also need privileged access to hardware
Linux has trusted device drivers built into the kernel itself
| | Monolithic Kernel | Modular Kernel |
|---|---|---|
| Advantages | All drivers included when kernel is compiled | Specific drivers loaded when the system boots up |
| Disadvantages | Kernel image is very big on disk and in memory; need to recompile kernel to add new drivers or functionality | Fragmentation of kernel memory as file systems and modules are loaded; security and stability risk from loading bad modules |
Graphical and Terminal Shells
The original Unix/Linux shell was purely text based
- Type a command and press enter
- See result as text in a single scrolling terminal
Newer Linux distributions include various graphical shells
- WIMP (Windows, Icons, Menus, Pointer)
- GUI (Graphical User Interface)
- User can install and boot into their preferred desktop shell (KDE, Gnome, etc.)
- Shell and Window Manager run as user level processes
The Uni's Linux Farm
Sixteen servers called lxfarm01, lxfarm02, ... , lxfarm16
Connect via lxfarmXX.csc.liv.ac.uk
Each server has 8 physical CPUs (each with 4 cores)
Log in with your MWS details; you may need to be on campus or connected via the VPN service
Directories
There are four special directory names
- `~` - Home directory
- `.` - Current directory
- `..` - Parent of current directory
- `/` - Root directory
Paths can be absolute (from root) or relative (to where you are)
File Permissions
Permissions are shown as a 10 character string split into four parts:
- eg. `drwxr-x---` or `-rw-r--r--`
- First character indicates directories and other special files
- Next three characters for user permissions (read, write, execute)
- Then three for group permissions
- Then three for other permissions (ie. everyone on the system)
Every user belongs to a group (can be in multiple groups)
Every process is owned by a user and a group (even system processes)
Root User
The user root has full access to everything
Many system files and background processes are owned by the root user
Root user permissions can be requested by users using the sudo command
Setting Octal and Mnemonic Permissions
Use the chmod command to change the permissions of a file
To add write permission for the group: chmod g+w filename
To remove read permission for other: chmod o-r filename
You can also use octal numbers to change permissions quickly
| (add up) | User | Group | Other |
|---|---|---|---|
| Read | 4 | 4 | 4 |
| Write | 2 | 2 | 2 |
| Execute | 1 | 1 | 1 |

Most files are set to 640 and most directories to 750
- 640 is `-rw-r-----` (user read/write, group read)
- 750 is `drwxr-x---` (user read/write/exec, group read/exec)
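
The add-up rule in the table can be checked with a small sketch. The helpers `mode_to_string` and `string_to_mode` are hypothetical names, not part of any library; they convert between an octal mode and the nine permission characters, leaving out the leading file-type character.

```c
/* Build the 9-character permission string for an octal mode such as 0640.
   `out` must have room for 10 bytes (9 characters plus the terminator). */
void mode_to_string(unsigned mode, char out[10]) {
    const char symbols[] = "rwxrwxrwx";
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);   /* user read is bit 8, other execute is bit 0 */
        out[i] = (mode & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}

/* Reverse direction: "rw-r-----" gives 0640. Assumes exactly nine
   characters; anything other than '-' counts as a granted permission. */
unsigned string_to_mode(const char *perms) {
    unsigned mode = 0;
    for (int i = 0; i < 9 && perms[i] != '\0'; i++) {
        mode <<= 1;
        if (perms[i] != '-')
            mode |= 1u;
    }
    return mode;
}
```

For 0640 the user digit is 4 + 2 = 6 (read and write), the group digit is 4 (read) and the other digit is 0, which is why the string comes out as `rw-r-----`.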
Assembly in Linux
Assembly code can be written and compiled in Linux with NASM
- Put pure assembly code into a text file
- Use interrupts to trigger system calls to read, write, exit, etc.
- Compile with command line options
- Link into an executable with further options
This is much more complicated than using Visual Studio
To compile and run assembly saved in `hello.asm`:

    nasm -f elf32 hello.asm
    ld -m elf_i386 hello.o -o hello
    ./hello
Code Example
Some code to display some text would look like:

    global _start                   ; Tell linker where to start
    section .data                   ; Constants go in .data section
    msg db 'Hello World!', 10, 0    ; db = define bytes (10 = \n)
    len dd 13                       ; dd = define dword (32 bits)
    section .text                   ; Code goes in .text section
    _start:
        mov eax, 4                  ; 4 = sys_write
        mov ebx, 1                  ; 1 = file handle for STDOUT
        mov ecx, msg                ; Address of string
        mov edx, [len]              ; Length of string
        int 0x80                    ; Trigger interrupt
        mov eax, 1                  ; 1 = sys_exit
        mov ebx, 0                  ; Exit status 0 (success)
        int 0x80                    ; 0x80 is 128 decimal
Process Creation
Processes are created (spawned) by other processes
The original process is the parent, the new process is the child
All running processes form a tree structure (which can be shown in Linux using the pstree command)
On most modern distributions, every user process has systemd as its top-level ancestor
There are several system calls in Linux that are used to spawn and manage child processes:
- exec()
  - Allows the process to execute another program
  - The new program replaces (overwrites) the calling process in memory and takes over its PCB, so the PID stays the same
- fork()
  - Spawns a new clone of the process
  - Both parent and child continue to run
- wait()
  - Called by parent process
  - Blocks until child process terminates
Fork
The fork() system call returns one of three possible values
- <0 (negative) - If the child could not be created (failure)
- =0 (zero) - In the child process
- >0 (positive) - In the parent process (child process ID)
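
For example:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();   // Call returns in each of the two processes
        if (pid < 0) {
            perror("fork");   // The child could not be created
            return 1;
        } else if (pid == 0) {
            printf("I'm the child process\n");
            // Will usually call exec() to load its own code
        } else {
            printf("I'm the parent and my child's ID is %d\n", (int)pid);
        }
        return 0;
    }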
The First Process
ROM stores a small program that runs a bootloader when a system is first turned on
- Linux systems use GRUB (GNU Grand Unified Bootloader)
- Loads kernel image from disk and starts fetch-execute cycle from first instruction
The first process to run is called systemd
- Its process ID (PID) is 1
- Spawns all other processes required by the kernel
- Can be configured for various targets (eg. server or desktop)
Continues to run as a background process (daemon)
- Offers on-demand spawning of other services
- Maintains logfiles to record system activity
- Keeps track of other processes and kernel settings
Shell Login
The sshd daemon (spawned by systemd) runs in the background, waiting for incoming connections
- An ssh client is used to connect to a Linux server
- The sshd daemon uses fork() to spawn a child process
- The child uses exec() to run a login process
- The login process checks credentials
- Then it uses exec() to run preferred shell process
Processes are being created all the time
- Everything is done via fork() and exec() system calls
- The parent/child concept is the same for Windows and other major OSs, although Windows creates processes with CreateProcess() rather than fork() and exec()
Running a Shell Command
When a command is typed into the shell:
- The shell uses fork() to create a child process
- That child uses exec() to run the command we typed in
- (This can be shown with the `ps` command)
A desktop shell (GUI) does the same thing but with clicks to spawn processes
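
The fork-then-exec sequence a shell follows for one command could be sketched roughly as below. `run_command` is a hypothetical helper name, and returning 127 when the program cannot be started simply follows the usual shell convention.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] with arguments argv, the way a shell runs one
   command. Returns the command's exit status, or -1 on error. */
int run_command(char *const argv[]) {
    pid_t pid = fork();                 /* 1. shell creates a child */
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);          /* 2. child becomes the command */
        perror("execvp");               /* only reached if exec failed */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) { /* 3. shell waits for the child */
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with `{"ls", "-l", NULL}` would behave much like typing `ls -l` at the prompt, with the return value matching what the shell reports in `$?`.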
Zombies and Orphans
Parent processes usually wait for their children to die
- Processor manager will tell the parent that the child terminated
- Sends a SIGCHLD signal to the parent
- Clean-up is not done until the parent acknowledges it no longer needs its child
If the termination of a child is not acknowledged by the parent:
- The child becomes a zombie
- It has finished but is still present in the process table
If a parent terminates before its children (eg. crashes):
- The children become orphans
- They are adopted by the systemd process
The systemd process calls wait() on the children it has adopted, so they are removed from the process table once they terminate rather than lingering as zombies
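
The zombie state can be observed directly. The sketch below assumes Linux with procfs mounted at `/proc`; `process_state` and `observe_zombie` are hypothetical helper names. It uses `waitid()` with `WNOWAIT` to block until the child has exited without acknowledging it, so the child is still a zombie when its state is read.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the one-letter state of process `pid` from /proc/<pid>/stat.
   Returns 0 if the entry cannot be read (eg. the process is gone). */
char process_state(pid_t pid) {
    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (!ok)
        return 0;
    char *paren = strrchr(line, ')');   /* line looks like "pid (name) S ..." */
    return (paren && paren[1] == ' ') ? paren[2] : 0;
}

/* Fork a child that exits at once and record its state before and after
   the parent acknowledges it. Returns 0 on success, -1 on error. */
int observe_zombie(char *before, char *after) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                       /* child terminates immediately */

    siginfo_t info;
    /* WNOWAIT: wait until the child has exited, but do not reap it yet. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
        return -1;
    *before = process_state(pid);       /* still in the process table */

    if (waitpid(pid, NULL, 0) < 0)      /* parent acknowledges the child */
        return -1;
    *after = process_state(pid);        /* entry has been removed */
    return 0;
}
```

On Linux the state letter for a zombie is `Z`; after the `waitpid()` call the `/proc` entry no longer exists, so the second lookup returns 0.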
Daemon Processes
Daemon processes such as sshd and systemd
- Usually the process name ends with 'd' to signify daemon
- Not associated with shell or any user
- Runs permanently in the background
Perform the background operations of the OS
- Subsystem managers are daemon processes
- Need their own time on the CPU, so must be scheduled
- Usually run with a higher priority
Perform tasks requested by other processes
Processes in the Linux File System
Kernel stores housekeeping information in the /proc directory
- Dynamic details about the current state of the kernel
- Virtual file system called procfs (ie. these are not real files on disk)
- Subdirectory for each running process
For example:
| Subdirectory | Purpose |
|---|---|
| /proc/PID/ | Stores all the details and status of process PID |
| /proc/PID/cmdline | The text that was typed to start the process |
| /proc/PID/fdinfo | The status of any open files used by the process |
| /proc/PID/status | The overall status of the process (lots of detail) |
| /proc/cpuinfo | Stores details about the physical CPU(s) |
| /proc/modules | Stores info about currently loaded kernel modules |
There is a top command to see a dynamic (real-time) display of process details
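
Because procfs entries behave like ordinary text files, a program can read them with normal file calls. The sketch below assumes Linux with procfs mounted at `/proc` and the usual `Key:<tab>value` layout of the status file; `read_status_field` and `procfs_agrees_with_getpid` are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Copy the value of `key` (eg. "Name", "State", "Pid") from a procfs
   status file into `out`. Returns 0 on success, -1 if not found. */
int read_status_field(const char *path, const char *key,
                      char *out, size_t out_size) {
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char line[256];
    size_t key_len = strlen(key);
    int result = -1;
    while (fgets(line, sizeof line, f)) {
        /* Each line looks like "Key:\tvalue\n" */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            char *value = line + key_len + 1;
            while (*value == ' ' || *value == '\t')
                value++;
            value[strcspn(value, "\n")] = '\0';   /* drop the newline */
            snprintf(out, out_size, "%s", value);
            result = 0;
            break;
        }
    }
    fclose(f);
    return result;
}

/* Returns 1 if the PID reported by /proc/self/status matches getpid(). */
int procfs_agrees_with_getpid(void) {
    char value[32];
    if (read_status_field("/proc/self/status", "Pid", value, sizeof value) != 0)
        return 0;
    return strtol(value, NULL, 10) == (long)getpid();
}
```

`/proc/self` always refers to the process doing the reading, which saves looking up the PID first.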
Linux Signals
A process running from the terminal can usually be terminated by typing ^C (CTRL+C)
- This sends an interrupt signal to the process
- Process intercepts signal and responds by terminating
Signals can be sent from one process to another with the kill() system call (signal() is used to choose how a process responds to a signal)
We can send signals with the kill command at the shell prompt
- There are various signals denoted by numbers and codes
- For example, to terminate process 438: `kill -s SIGKILL 438`
Responding to Signals
The process that receives a signal can respond in three ways
- Perform the action requested
- Ignore the signal completely
- Catch the signal and run some other arbitrary code
The only signals that cannot be ignored or caught are SIGKILL (9) and SIGSTOP (19)
| Code | Number | Meaning |
|---|---|---|
| SIGINT | 2 | Interrupted from keyboard (via CTRL+C) |
| SIGKILL | 9 | Request to terminate process (cannot be ignored) |
| SIGTERM | 15 | Request to terminate process (might be ignored) |
| SIGCHLD | 17 | Indicates that a child process has terminated |
| SIGIO | 29 | Indicates that input or output is ready |
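
A process chooses between ignoring and catching a signal with `sigaction()`. In the sketch below, `catch_signal`, `ignore_signal` and `last_caught_signal` are hypothetical helper names and the handler only records the signal number. Trying to catch or ignore SIGKILL is rejected by the kernel, so both helpers return -1 for it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t caught_signal = 0;

/* Handler: only record which signal arrived (async-signal-safe work). */
static void record_signal(int signo) {
    caught_signal = signo;
}

/* Catch `signo` and run record_signal() instead of the default action.
   Returns 0 on success, -1 if the signal cannot be caught. */
int catch_signal(int signo) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = record_signal;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, NULL);
}

/* Ignore `signo` completely. Returns 0 on success, -1 on failure. */
int ignore_signal(int signo) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = SIG_IGN;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, NULL);
}

/* Number of the last signal seen by the handler (0 if none yet). */
int last_caught_signal(void) {
    return caught_signal;
}
```

After `catch_signal(SIGINT)`, pressing CTRL+C no longer terminates the program; the handler runs and the program carries on.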
Terminating Zombie Processes
Imagine a parent process is badly coded:
- Spawns a child via fork()
- But does not acknowledge when that child terminates (via wait() or SIGCHLD)
The child will become a zombie
- Hangs around in process table doing nothing
- Will take up some (minimal) system resources
You can try to resend SIGCHLD to the parent but it will probably ignore it
So you'll have to send SIGKILL to the parent to kill it
- The parent will be terminated and systemd will adopt the zombie
- systemd acknowledges the zombie with wait(), which removes it from the process table
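
For contrast, a well-behaved parent might acknowledge its children from a SIGCHLD handler. This is one possible sketch rather than the only approach; `reap_children` and `spawn_and_reap` are hypothetical names, and it assumes SIGCHLD is not already blocked by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped = 0;

/* SIGCHLD handler: acknowledge every child that has terminated so far.
   One signal can stand for several children, hence the loop. */
static void reap_children(int signo) {
    (void)signo;
    int saved_errno = errno;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Spawn `count` short-lived children and return how many the handler
   reaped (or -1 on error). */
int spawn_and_reap(int count) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = reap_children;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &action, NULL) < 0)
        return -1;

    /* Block SIGCHLD while forking so no notification slips past. */
    sigset_t block, previous;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &previous);

    reaped = 0;
    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            sigprocmask(SIG_SETMASK, &previous, NULL);
            return -1;
        }
        if (pid == 0)
            _exit(0);                   /* child terminates immediately */
    }
    while (reaped < count)
        sigsuspend(&previous);          /* unblock SIGCHLD and sleep */

    sigprocmask(SIG_SETMASK, &previous, NULL);
    return reaped;
}
```

Because every terminated child is passed to `waitpid()`, none of them is left behind as a zombie.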
Inter-Process Communication (IPC)
Processes need to communicate with each other
- To share, send, and receive data
- To provide services (servers) to other processes (clients)
Types of IPC
- Shared memory
- Shared files
- Pipes
- Sockets
Shared memory and shared files allow two processes to access the same memory location or file at the same time
- Introduces synchronisation issues
- Needs to be coordinated by semaphores and locks
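
A minimal shared-memory sketch, assuming Linux (`MAP_ANONYMOUS`): parent and child share one integer through `mmap()`. To keep it short it does not use a semaphore or lock; the parent simply waits for the child to finish before reading, which is the only coordination in this example. `share_value_with_child` is a hypothetical name.

```c
#define _DEFAULT_SOURCE        /* exposes MAP_ANONYMOUS in glibc headers */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Let a child write `value` into memory shared with the parent and return
   what the parent reads back (or -1 on error). */
int share_value_with_child(int value) {
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;       /* written by the child, visible to the parent */
        _exit(0);
    }
    /* Coordination: only read after the child has finished writing. */
    waitpid(pid, NULL, 0);
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

With an ordinary variable instead of the `MAP_SHARED` mapping, the child would modify its own copy and the parent would still read 0. If both processes kept running and updating the value, a semaphore or lock would be needed.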
Pipes
A pipe is a form of IPC between related processes, typically two children of the same parent (such as the shell)
Usually triggered by typing a command at the prompt
- Join two processes with the `|` (pipe) symbol
- Output from the first becomes input to the second
For example, you can list all kernel modules with `cat /proc/modules` and you can count the lines of a file with `wc --lines filename`
So you can find out how many kernel modules are running with `cat /proc/modules | wc --lines`
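
What the shell does for `left | right` could be sketched as follows: one parent creates the pipe and forks two children, each of which rewires its standard output or input to one end of the pipe before calling exec. `spawn_with_fd` and `run_pipeline` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child whose stdin or stdout (`target`) is replaced by the pipe
   end `fd`, then exec argv. `other_end` is the pipe end it does not use. */
static pid_t spawn_with_fd(char *const argv[], int fd, int target, int other_end) {
    pid_t pid = fork();
    if (pid == 0) {
        dup2(fd, target);      /* pipe end takes the place of stdin/stdout */
        close(fd);
        close(other_end);      /* the unused end must be closed in the child */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }
    return pid;
}

/* Equivalent of `left | right`. Returns the exit status of `right`, or -1. */
int run_pipeline(char *const left[], char *const right[]) {
    int ends[2];               /* ends[0] = read end, ends[1] = write end */
    if (pipe(ends) < 0)
        return -1;

    pid_t writer = spawn_with_fd(left, ends[1], STDOUT_FILENO, ends[0]);
    pid_t reader = spawn_with_fd(right, ends[0], STDIN_FILENO, ends[1]);

    /* The parent keeps neither end open, so the reader sees end-of-file
       as soon as the writer finishes. */
    close(ends[0]);
    close(ends[1]);

    int status = 0;
    if (writer > 0)
        waitpid(writer, NULL, 0);
    if (reader < 0 || waitpid(reader, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing `{"cat", "/proc/modules", NULL}` as `left` and `{"wc", "--lines", NULL}` as `right` would reproduce the module-counting example.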
Sockets
A socket is a form of IPC that can span multiple systems
- One process is the server (daemon) listening for clients
- Other process is the client that connects to the server
- Communication is bidirectional (both can send/receive)
- Accessed through file descriptors like open files (Unix domain sockets also appear as special files in the file system)
The processes don't need to be on the same machine
- Don't need the same parent process (unlike pipes)
- Typically provide internet services
- Server (eg. httpd, maild, sshd) is running as a daemon
- Clients connect when they need to get/send data
Client-Server Socket Handling
Server process uses the listen() system call to wait for clients
- When a client connects, usually spawns child via fork()
- Child process handles the communication then terminates
- Parent process puts accept() in a loop to handle multiple incoming clients
- The socket looks like a local file to the read() and write() system calls (fread() and fwrite() are library wrappers around these)
There is a diagram for this at the end of lecture 12
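
The listen/accept/fork pattern could be sketched as a small echo server. To stay runnable without a network, this version uses a local Unix domain socket (`AF_UNIX`) rather than an internet socket; an internet server would use `AF_INET` addresses with the same overall structure. The names `fill_address`, `handle_client`, `start_echo_server` and `echo_once` are hypothetical, as is any socket path passed in.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

static void fill_address(struct sockaddr_un *addr, const char *path) {
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, path, sizeof addr->sun_path - 1);
}

/* Serve one client: echo back whatever it sends until it disconnects. */
static void handle_client(int client) {
    char buffer[256];
    ssize_t n;
    while ((n = read(client, buffer, sizeof buffer)) > 0) {
        if (write(client, buffer, (size_t)n) < 0)   /* socket used like a file */
            break;
    }
    close(client);
}

/* Start an echo server on the socket file `path` in a background process.
   Returns the server's PID, or -1 on error. */
pid_t start_echo_server(const char *path) {
    struct sockaddr_un addr;
    fill_address(&addr, path);
    unlink(path);                       /* remove a stale socket file */

    int listener = socket(AF_UNIX, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, 8) < 0) {
        close(listener);
        return -1;
    }

    pid_t server = fork();
    if (server != 0) {                  /* caller (or fork failure) */
        close(listener);
        return server;
    }
    signal(SIGCHLD, SIG_IGN);           /* finished children are cleaned up automatically */
    for (;;) {                          /* accept() in a loop */
        int client = accept(listener, NULL, NULL);
        if (client < 0)
            continue;
        if (fork() == 0) {              /* child handles this client */
            close(listener);
            handle_client(client);
            _exit(0);
        }
        close(client);                  /* parent goes back to accept() */
    }
}

/* Client: send `message` and read the echo into `reply`.
   Returns the number of bytes read, or -1 on error. */
ssize_t echo_once(const char *path, const char *message, char *reply, size_t size) {
    struct sockaddr_un addr;
    fill_address(&addr, path);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    ssize_t n = -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        write(fd, message, strlen(message)) >= 0) {
        n = read(fd, reply, size - 1);
        if (n >= 0)
            reply[n] = '\0';
    }
    close(fd);
    return n;
}
```

Since `listen()` is called before the server process is forked, a client can connect straight away; the connection waits in the queue until `accept()` picks it up. The server runs until it is killed, for example with `kill(server, SIGTERM)`.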